COFFEE: an objective function for multiple sequence alignments

نویسندگان

  • Cédric Notredame
  • Liisa Holm
  • Desmond G. Higgins
چکیده

MOTIVATION In order to increase the accuracy of multiple sequence alignments, we designed a new strategy for optimizing multiple sequence alignments by genetic algorithm. We named it COFFEE (Consistency based Objective Function For alignmEnt Evaluation). The COFFEE score reflects the level of consistency between a multiple sequence alignment and a library containing pairwise alignments of the same sequences. RESULTS We show that multiple sequence alignments can be optimized for their COFFEE score with the genetic algorithm package SAGA. The COFFEE function is tested on 11 test cases made of structural alignments extracted from 3D_ali. These alignments are compared to those produced using five alternative methods. Results indicate that COFFEE outperforms the other methods when the level of identity between the sequences is low. Accuracy is evaluated by comparison with the structural alignments used as references. We also show that the COFFEE score can be used as a reliability index on multiple sequence alignments. Finally, we show that given a library of structure-based pairwise sequence alignments extracted from FSSP, SAGA can produce high-quality multiple sequence alignments. The main advantage of COFFEE is its flexibility. With COFFEE, any method suitable for making pairwise alignments can be extended to making multiple alignments. AVAILABILITY The package is available along with the test cases through the WWW: http://www. ebi.ac.uk/cedric CONTACT [email protected]

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Evaluation of scoring functions for protein multiple sequence alignment using structural alignments

The process of aligning a group of protein sequences to obtain a meaningful Multiple Sequence Alignment (MSA) is a basic tool in current bioinformatic research. The development of new MSA algorithms raises the need for an efficient way to evaluate the quality of an alignment, in order to select the best alignment among the ones produced by the available algorithms. A natural way to evaluate the...

متن کامل

Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns

MOTIVATION Multiple sequence alignments (MSAs) are widely used approaches in bioinformatics to carry out other tasks such as structure predictions, biological function analyses or phylogenetic modeling. However, current tools usually provide partially optimal alignments, as each one is focused on specific biological features. Thus, the same set of sequences can produce different alignments, abo...

متن کامل

R-Coffee: a web server for accurately aligning noncoding RNA sequences

The R-Coffee web server produces highly accurate multiple alignments of noncoding RNA (ncRNA) sequences, taking into account predicted secondary structures. R-Coffee uses a novel algorithm recently incorporated in the T-Coffee package. R-Coffee works along the same lines as T-Coffee: it uses pairwise or multiple sequence alignment (MSA) methods to compute a primary library of input alignments. ...

متن کامل

R-Coffee: a method for multiple alignment of non-coding RNA

R-Coffee is a multiple RNA alignment package, derived from T-Coffee, designed to align RNA sequences while exploiting secondary structure information. R-Coffee uses an alignment-scoring scheme that incorporates secondary structure information within the alignment. It works particularly well as an alignment improver and can be combined with any existing sequence alignment method. In this work, w...

متن کامل

The iRMSD: a local measure of sequence alignment accuracy using structural information

MOTIVATION We introduce the iRMSD, a new type of RMSD, independent from any structure superposition and suitable for evaluating sequence alignments of proteins with known structures. RESULTS We demonstrate that the iRMSD is equivalent to the standard RMSD although much simpler to compute and we also show that it is suitable for comparing sequence alignments and benchmarking multiple sequence ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Bioinformatics

دوره 14 5  شماره 

صفحات  -

تاریخ انتشار 1998